fix: propagate frame-level data modifier failures#5709
Conversation
DeepmdData.get_single_frame ran the data modifier through a one-off ThreadPoolExecutor but never called future.result(). The context manager waited for the task to finish, but an exception raised inside modify_data stayed stored on the Future and was never surfaced, so callers received the unmodified frame and, with caching enabled, that bad frame could be cached for future use. Call future.result() before leaving the modifier block so modifier exceptions propagate. Add a test with a raising modifier asserting get_single_frame re-raises and does not cache the failed frame. Fix deepmodeling#5690
📝 WalkthroughWalkthroughDeepmdData.get_single_frame now calls future.result() after submitting the modifier's modify_data task in a ThreadPoolExecutor, propagating exceptions instead of silently ignoring them. A new unit test verifies the exception propagates and the failing frame is not cached. ChangesModifier error propagation
Estimated code review effort: 1 (Trivial) | ~5 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
deepmd/utils/data.py (1)
526-535: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick winConsider dropping the single-task ThreadPoolExecutor.
Only one task is submitted here, then immediately awaited via
future.result()— the executor buys no parallelism, just thread-pool setup/teardown cost perget_single_framecall (a hot path viapreload_and_modify_all_data_torch). Other modifier call sites in this class (get_test,_load_batch_set) callmodify_datadirectly.♻️ Proposed simplification
if self.modifier is not None: - with ThreadPoolExecutor(max_workers=num_worker) as executor: - # Apply modifier if it exists - future = executor.submit( - self.modifier.modify_data, - frame_data, - self, - ) - # propagate any exception raised inside the modifier instead of - # silently returning (and caching) an unmodified frame - future.result() + # Apply modifier directly; exceptions propagate naturally + self.modifier.modify_data(frame_data, self) if self.use_modifier_cache: # Cache the modified frame to avoid recomputation self._modified_frame_cache[index] = copy.deepcopy(frame_data)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/utils/data.py` around lines 526 - 535, The single-task ThreadPoolExecutor in get_single_frame adds setup/teardown overhead without any parallelism because only one modify_data call is submitted and immediately awaited. Simplify get_single_frame by calling self.modifier.modify_data directly, matching the existing get_test and _load_batch_set call sites, while still preserving the exception propagation behavior currently achieved by future.result().
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@deepmd/utils/data.py`:
- Around line 526-535: The single-task ThreadPoolExecutor in get_single_frame
adds setup/teardown overhead without any parallelism because only one
modify_data call is submitted and immediately awaited. Simplify get_single_frame
by calling self.modifier.modify_data directly, matching the existing get_test
and _load_batch_set call sites, while still preserving the exception propagation
behavior currently achieved by future.result().
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f0b6ac40-63bc-4117-8027-3b12b765ca99
📒 Files selected for processing (2)
deepmd/utils/data.pysource/tests/common/test_deepmd_data.py
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #5709 +/- ##
==========================================
- Coverage 81.97% 81.09% -0.88%
==========================================
Files 959 980 +21
Lines 105748 109355 +3607
Branches 4102 4207 +105
==========================================
+ Hits 86684 88681 +1997
- Misses 17573 19147 +1574
- Partials 1491 1527 +36 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Problem
DeepmdData.get_single_frame()runs the data modifier through a one-offThreadPoolExecutorbut never callsfuture.result(). The context manager waits for the submitted task to finish, but any exception raised insidemodifier.modify_data(...)stays stored on theFutureand is never surfaced. As a result, frame-level data loading silently ignores modifier failures: callers receive the unmodified frame, and when caching is enabled that bad frame can be cached for future use. Other code paths call the modifier directly and do propagate errors, so the behavior was inconsistent.Fix
Call
future.result()before leaving the modifier block so an exception raised inside the modifier propagates instead of being swallowed (and the unmodified frame cached).Test
A new test constructs a
DeepmdDatawith a modifier whosemodify_dataalways raises, and asserts thatget_single_framere-raises the error and does not cache the failed frame. On the current code the error is swallowed (no exception, frame cached); after the fix the error propagates.Fix #5690
Summary by CodeRabbit